The Similarity for Nominal Variables Based on F-Divergence
Abstract
Measuring the similarity between nominal variables is an important problem in data mining. It is the basis for measuring the similarity of data objects that contain nominal variables. There are two kinds of traditional methods for this task: the first simply distinguishes variables as same or different, while the second measures similarity based on co-occurrence with variables of other attributes. Although these methods perform well under some conditions, their accuracy is still insufficient. This paper proposes an algorithm to measure the similarity between nominal variables of the same attribute, based on the fact that the similarity between nominal variables depends on the relationship between the subsets of the dataset that hold them. The algorithm uses the difference between distributions, quantified by f-divergence, to form a feature vector for each nominal variable. A theoretical analysis helps to choose the best metric from the four most commonly used forms of f-divergence. The time complexity of the method is linear in the size of the dataset, which makes it suitable for processing large-scale data. Experiments that use the derived similarity metrics with K-modes on a wide range of UCI datasets demonstrate the effectiveness of the proposed method.
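The abstract does not spell out the construction, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that the feature vector of a nominal value holds, for every other attribute, the divergence between that attribute's distribution inside the subset of rows holding the value and its distribution over the whole dataset. It also assumes the squared Hellinger distance as the f-divergence, since the abstract does not say which of the four forms is selected. The function names `distribution`, `hellinger_sq`, `feature_vector`, and `dissimilarity` are hypothetical, and this naive version rescans the rows on each call, so it does not reproduce the linear time complexity claimed for the method.

```python
from collections import Counter
from math import sqrt


def distribution(values):
    """Empirical distribution of a list of nominal values as {value: probability}."""
    counts = Counter(values)
    total = len(values)
    return {k: c / total for k, c in counts.items()}


def hellinger_sq(p, q):
    """Squared Hellinger distance between two discrete distributions.

    This is one common f-divergence; it is chosen here as an assumption
    because it stays finite when a value has zero probability.
    """
    keys = set(p) | set(q)
    return 0.5 * sum((sqrt(p.get(k, 0.0)) - sqrt(q.get(k, 0.0))) ** 2 for k in keys)


def feature_vector(rows, attr, value):
    """Assumed construction: one divergence per other attribute, comparing the
    subset of rows where column `attr` equals `value` against the whole dataset."""
    subset = [r for r in rows if r[attr] == value]
    vec = []
    for j in range(len(rows[0])):
        if j == attr:
            continue
        p = distribution([r[j] for r in subset])
        q = distribution([r[j] for r in rows])
        vec.append(hellinger_sq(p, q))
    return vec


def dissimilarity(rows, attr, v, w):
    """Euclidean distance between the feature vectors of two values (assumption)."""
    fv = feature_vector(rows, attr, v)
    fw = feature_vector(rows, attr, w)
    return sqrt(sum((a - b) ** 2 for a, b in zip(fv, fw)))


# Toy dataset: column 0 is a colour, column 1 is a shape.
rows = [
    ("red", "round"), ("red", "round"),
    ("pink", "round"), ("pink", "round"),
    ("blue", "square"), ("blue", "square"),
]
print(dissimilarity(rows, 0, "red", "pink"))  # values with identical subsets
print(dissimilarity(rows, 0, "red", "blue"))  # values with different subsets
```

In this toy dataset, "red" and "pink" co-occur with the same shapes and so receive the same feature vector, whereas "blue" does not. A simple same-or-different comparison would treat all three values as equally dissimilar.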
Similar resources
To examine Increasing of nominal GNP according to monitarists & Kenziyan Approaches in Iranian Economy.
This paper attempts to examine the increase of nominal GNP according to monetarist and Keynesian approaches. In this research we have used time series data collected from the database of the Iranian Central Bank between 1981 and 2015. Variables in this study are: Gross National Product (GNP) based on current prices, supply of money, and government expenditures. Microsoft Eviwse.9 have used in th...
Evaluation of Similarity Measures for Template Matching
Image matching is a critical process in various photogrammetry, computer vision and remote sensing applications such as image registration, 3D model reconstruction, change detection, image fusion, pattern recognition, autonomous navigation, and digital elevation model (DEM) generation and orientation. The primary goal of the image matching process is to establish the correspondence between two ...
Effects of Fiscal and Monetary Policies on the Iranian Economy: An Optimal Control Approach
This paper evaluates the interacted effects of the fiscal and monetary policies on the nominal and real macro-variables of the Iranian economy. Our analysis is thus based on the optimal control theory by which the optimal path of the control variables including monetary and fiscal tools are determined over the period 1963-2006. We also use a macro-econometric model in form of a simultaneous equ...
A note on decision making in medical investigations using new divergence measures for intuitionistic fuzzy sets
Srivastava and Maheshwari (Iranian Journal of Fuzzy Systems 13(1)(2016) 25-44) introduced a new divergence measure for intuitionistic fuzzy sets (IFSs). The properties of the proposed divergence measure were studied and the efficiency of the proposed divergence measure in the context of medical diagnosis was also demonstrated. In this note, we point out some errors in ...
Critical Pair Analysis in Nominal Rewriting
Nominal rewriting (Fernández, Gabbay & Mackie, 2004; Fernández & Gabbay, 2007) is a framework that extends first-order term rewriting by a binding mechanism based on the nominal approach (Gabbay & Pitts, 2002; Pitts, 2003). In this paper, we investigate confluence properties of nominal rewriting, following the study of orthogonal systems in (Suzuki et al., 2015), but here we treat systems in wh...